Overview

Dataset statistics

Number of variables: 18
Number of observations: 70692
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3971
Duplicate rows (%): 5.6%
Total size in memory: 9.7 MiB
Average record size in memory: 144.0 B
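The size and duplicate figures above can be cross-checked with simple arithmetic. The sketch below assumes every column is stored as an 8-byte float64, which the report does not state but which matches the 144.0 B record size; the constant names are illustrative.

```python
# Figures copied from the dataset statistics above.
N_ROWS = 70692
N_COLS = 18
N_DUPLICATE_ROWS = 3971

# Assumption: every column is an 8-byte float64 (not stated in the report).
BYTES_PER_VALUE = 8

record_bytes = N_COLS * BYTES_PER_VALUE          # bytes per row
total_mib = N_ROWS * record_bytes / 2**20        # whole table, in MiB
duplicate_pct = 100 * N_DUPLICATE_ROWS / N_ROWS  # share of duplicate rows

print(f"record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under that assumption the three results round to the reported 144 B, 9.7 MiB and 5.6%.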

Variable types

Numeric: 4
Categorical: 14

Alerts

Dataset has 3971 (5.6%) duplicate rows [Duplicates]
GenHlth is highly overall correlated with DiffWalk [High correlation]
DiffWalk is highly overall correlated with GenHlth [High correlation]
CholCheck is highly imbalanced (83.3%) [Imbalance]
HvyAlcoholConsump is highly imbalanced (74.5%) [Imbalance]
Stroke is highly imbalanced (66.4%) [Imbalance]
Diabetes is uniformly distributed [Uniform]
MentHlth has 48091 (68.0%) zeros [Zeros]
PhysHlth has 39915 (56.5%) zeros [Zeros]
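The imbalance percentages in these alerts are consistent with one minus the normalised Shannon entropy of the class counts. That reading is inferred from the numbers, not stated in the report, so the sketch below is a plausible reconstruction, not the tool's confirmed formula. The function name is illustrative; the counts come from the variable sections of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for k >= 2 classes.

    H is the Shannon entropy (in bits) of the class proportions.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(len(counts))


# Class counts (value 0.0 first, then 1.0) taken from the variable sections.
class_counts = {
    "CholCheck": [1749, 68943],
    "HvyAlcoholConsump": [67672, 3020],
    "Stroke": [66297, 4395],
    "Diabetes": [35346, 35346],
}

for name, counts in class_counts.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

A perfectly balanced variable such as Diabetes scores 0%, which fits its "uniformly distributed" alert.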

Reproduction

Analysis started: 2023-07-02 06:18:45.663656
Analysis finished: 2023-07-02 06:19:01.858343
Duration: 16.19 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

Age
Real number (ℝ)

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.5840548
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 552.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 7
Median: 9
Q3: 11
95-th percentile: 13
Maximum: 13
Range: 12
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.8521531
Coefficient of variation (CV): 0.33226176
Kurtosis: -0.21323222
Mean: 8.5840548
Median Absolute Deviation (MAD): 2
Skewness: -0.54592277
Sum: 606824
Variance: 8.1347774
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value             Count   Frequency (%)
10                10856   15.4%
9                 10112   14.3%
8                  8603   12.2%
11                 8044   11.4%
7                  6872   9.7%
13                 5426   7.7%
12                 5394   7.6%
6                  4648   6.6%
5                  3520   5.0%
4                  2793   4.0%
Other values (3)   4424   6.3%

Value   Count   Frequency (%)
1         979   1.4%
2        1396   2.0%
3        2049   2.9%
4        2793   4.0%
5        3520   5.0%
6        4648   6.6%
7        6872   9.7%
8        8603   12.2%
9       10112   14.3%
10      10856   15.4%

Value   Count   Frequency (%)
13       5426   7.7%
12       5394   7.6%
11       8044   11.4%
10      10856   15.4%
9       10112   14.3%
8        8603   12.2%
7        6872   9.7%
6        4648   6.6%
5        3520   5.0%
4        2793   4.0%
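Because Age takes only 13 distinct values, its descriptive statistics can be recomputed from the value counts listed above. This is a minimal sketch; dividing by n - 1 (sample variance) is an assumption, chosen because it matches the reported variance.

```python
import math

# Age value -> count, copied from the frequency tables above.
age_counts = {
    1: 979, 2: 1396, 3: 2049, 4: 2793, 5: 3520, 6: 4648, 7: 6872,
    8: 8603, 9: 10112, 10: 10856, 11: 8044, 12: 5394, 13: 5426,
}

n = sum(age_counts.values())
total = sum(value * count for value, count in age_counts.items())
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in age_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```

The recomputed sum, mean, variance and CV agree with the reported values to the precision shown in the report.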

Sex
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
38386 
1.0
32306 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 38386
54.3%
1.0 32306
45.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 38386
54.3%
1.0 32306
45.7%

Most occurring characters

ValueCountFrequency (%)
0 109078
51.4%
. 70692
33.3%
1 32306
 
15.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 109078
77.2%
1 32306
 
22.8%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%
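The character tables above follow from treating each value as the three-character string "0.0" or "1.0". The sketch below rebuilds the Sex character and category counts with Python's standard `unicodedata` module. It is not the profiler's own code, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# Sex value -> count, copied from the Common Values table.
value_counts = {"0.0": 38386, "1.0": 32306}

# Count every character, weighted by how often its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

# Group characters by Unicode general category:
# "Nd" is Decimal Number, "Po" is Other Punctuation.
category_counts = Counter()
for ch, count in char_counts.items():
    category_counts[unicodedata.category(ch)] += count

print(dict(char_counts))
print(dict(category_counts))
```

The totals match the report: 109078 zeros, 70692 dots and 32306 ones, giving 141384 Decimal Number and 70692 Other Punctuation characters.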

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 109078
51.4%
. 70692
33.3%
1 32306
 
15.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 109078
51.4%
. 70692
33.3%
1 32306
 
15.2%

HighChol
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
37163 
0.0
33529 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 37163
52.6%
0.0 33529
47.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 37163
52.6%
0.0 33529
47.4%

Most occurring characters

ValueCountFrequency (%)
0 104221
49.1%
. 70692
33.3%
1 37163
 
17.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 104221
73.7%
1 37163
 
26.3%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 104221
49.1%
. 70692
33.3%
1 37163
 
17.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 104221
49.1%
. 70692
33.3%
1 37163
 
17.5%

CholCheck
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
68943 
0.0
 
1749

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 68943
97.5%
0.0 1749
 
2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 68943
97.5%
0.0 1749
 
2.5%

Most occurring characters

ValueCountFrequency (%)
0 72441
34.2%
. 70692
33.3%
1 68943
32.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 72441
51.2%
1 68943
48.8%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 72441
34.2%
. 70692
33.3%
1 68943
32.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 72441
34.2%
. 70692
33.3%
1 68943
32.5%

BMI
Real number (ℝ)

Distinct: 80
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.856985
Minimum: 12
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 552.4 KiB

Quantile statistics

Minimum: 12
5-th percentile: 21
Q1: 25
Median: 29
Q3: 33
95-th percentile: 43
Maximum: 98
Range: 86
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 7.1139539
Coefficient of variation (CV): 0.23826765
Kurtosis: 7.1640805
Mean: 29.856985
Median Absolute Deviation (MAD): 4
Skewness: 1.7191802
Sum: 2110650
Variance: 50.608339
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
27                  6327   9.0%
26                  4975   7.0%
28                  4583   6.5%
24                  4392   6.2%
30                  4344   6.1%
29                  4219   6.0%
25                  4031   5.7%
31                  3753   5.3%
32                  3481   4.9%
23                  3315   4.7%
Other values (70)  27272   38.6%

Value   Count   Frequency (%)
12          1   < 0.1%
13          8   < 0.1%
14          8   < 0.1%
15         30   < 0.1%
16         70   0.1%
17        170   0.2%
18        366   0.5%
19        691   1.0%
20       1256   1.8%
21       2028   2.9%

Value   Count   Frequency (%)
98          4   < 0.1%
95          4   < 0.1%
92          9   < 0.1%
89          4   < 0.1%
87         13   < 0.1%
86          1   < 0.1%
85          1   < 0.1%
84         13   < 0.1%
83          1   < 0.1%
82         11   < 0.1%

Smoker
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
37094 
1.0
33598 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
0.0 37094
52.5%
1.0 33598
47.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 37094
52.5%
1.0 33598
47.5%

Most occurring characters

ValueCountFrequency (%)
0 107786
50.8%
. 70692
33.3%
1 33598
 
15.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 107786
76.2%
1 33598
 
23.8%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 107786
50.8%
. 70692
33.3%
1 33598
 
15.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 107786
50.8%
. 70692
33.3%
1 33598
 
15.8%

HeartDiseaseorAttack
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
60243 
1.0
10449 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 60243
85.2%
1.0 10449
 
14.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 60243
85.2%
1.0 10449
 
14.8%

Most occurring characters

ValueCountFrequency (%)
0 130935
61.7%
. 70692
33.3%
1 10449
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 130935
92.6%
1 10449
 
7.4%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 130935
61.7%
. 70692
33.3%
1 10449
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 130935
61.7%
. 70692
33.3%
1 10449
 
4.9%

PhysActivity
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
49699 
0.0
20993 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 49699
70.3%
0.0 20993
29.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 49699
70.3%
0.0 20993
29.7%

Most occurring characters

ValueCountFrequency (%)
0 91685
43.2%
. 70692
33.3%
1 49699
23.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 91685
64.8%
1 49699
35.2%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 91685
43.2%
. 70692
33.3%
1 49699
23.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 91685
43.2%
. 70692
33.3%
1 49699
23.4%

Fruits
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
43249 
0.0
27443 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 43249
61.2%
0.0 27443
38.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 43249
61.2%
0.0 27443
38.8%

Most occurring characters

ValueCountFrequency (%)
0 98135
46.3%
. 70692
33.3%
1 43249
20.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 98135
69.4%
1 43249
30.6%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 98135
46.3%
. 70692
33.3%
1 43249
20.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 98135
46.3%
. 70692
33.3%
1 43249
20.4%

Veggies
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
55760 
0.0
14932 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row0.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.0 55760
78.9%
0.0 14932
 
21.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 55760
78.9%
0.0 14932
 
21.1%

Most occurring characters

ValueCountFrequency (%)
0 85624
40.4%
. 70692
33.3%
1 55760
26.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 85624
60.6%
1 55760
39.4%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 85624
40.4%
. 70692
33.3%
1 55760
26.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 85624
40.4%
. 70692
33.3%
1 55760
26.3%

HvyAlcoholConsump
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
67672 
1.0
 
3020

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 67672
95.7%
1.0 3020
 
4.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 67672
95.7%
1.0 3020
 
4.3%

Most occurring characters

ValueCountFrequency (%)
0 138364
65.2%
. 70692
33.3%
1 3020
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 138364
97.9%
1 3020
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 138364
65.2%
. 70692
33.3%
1 3020
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 138364
65.2%
. 70692
33.3%
1 3020
 
1.4%

GenHlth
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
3.0
23427 
2.0
19872 
4.0
13303 
1.0
8282 
5.0
5808 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3.0
2nd row3.0
3rd row1.0
4th row3.0
5th row2.0

Common Values

ValueCountFrequency (%)
3.0 23427
33.1%
2.0 19872
28.1%
4.0 13303
18.8%
1.0 8282
 
11.7%
5.0 5808
 
8.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
3.0 23427
33.1%
2.0 19872
28.1%
4.0 13303
18.8%
1.0 8282
 
11.7%
5.0 5808
 
8.2%

Most occurring characters

ValueCountFrequency (%)
. 70692
33.3%
0 70692
33.3%
3 23427
 
11.0%
2 19872
 
9.4%
4 13303
 
6.3%
1 8282
 
3.9%
5 5808
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 70692
50.0%
3 23427
 
16.6%
2 19872
 
14.1%
4 13303
 
9.4%
1 8282
 
5.9%
5 5808
 
4.1%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 70692
33.3%
0 70692
33.3%
3 23427
 
11.0%
2 19872
 
9.4%
4 13303
 
6.3%
1 8282
 
3.9%
5 5808
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 70692
33.3%
0 70692
33.3%
3 23427
 
11.0%
2 19872
 
9.4%
4 13303
 
6.3%
1 8282
 
3.9%
5 5808
 
2.7%

MentHlth
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.752037
Minimum: 0
Maximum: 30
Zeros: 48091
Zeros (%): 68.0%
Negative: 0
Negative (%): 0.0%
Memory size: 552.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 8.1556266
Coefficient of variation (CV): 2.173653
Kurtosis: 4.4915475
Mean: 3.752037
Median Absolute Deviation (MAD): 0
Skewness: 2.3881096
Sum: 265239
Variance: 66.514244
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value              Count   Frequency (%)
0                  48091   68.0%
30                  4320   6.1%
2                   3267   4.6%
5                   2519   3.6%
1                   2051   2.9%
3                   1967   2.8%
10                  1924   2.7%
15                  1767   2.5%
20                  1125   1.6%
4                    981   1.4%
Other values (21)   2680   3.8%

Value   Count   Frequency (%)
0       48091   68.0%
1        2051   2.9%
2        3267   4.6%
3        1967   2.8%
4         981   1.4%
5        2519   3.6%
6         288   0.4%
7         825   1.2%
8         198   0.3%
9          28   < 0.1%

Value   Count   Frequency (%)
30       4320   6.1%
29         53   0.1%
28         99   0.1%
27         17   < 0.1%
26         17   < 0.1%
25        425   0.6%
24         10   < 0.1%
23         13   < 0.1%
22         22   < 0.1%
21         84   0.1%
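With 68.0% of MentHlth values equal to zero, Q1 and the median must both be zero, and Q3 falls on the first value whose cumulative count passes 75% of the observations. The sketch below recovers those quartiles from the lowest value counts listed above. Linear interpolation between neighbouring ranks is an assumption; the report does not say which quantile method it uses. The function names are illustrative.

```python
import math


def value_at(sorted_counts, index):
    """Return the value at zero-based position `index` in the sorted data."""
    seen = 0
    for value, count in sorted_counts:
        seen += count
        if index < seen:
            return value
    raise ValueError("index lies beyond the values listed")


def quantile(sorted_counts, n, q):
    """Quantile q in [0, 1] using linear interpolation between ranks (assumed)."""
    position = q * (n - 1)
    lower, upper = math.floor(position), math.ceil(position)
    v_lower = value_at(sorted_counts, lower)
    v_upper = value_at(sorted_counts, upper)
    return v_lower + (position - lower) * (v_upper - v_lower)


N_OBSERVATIONS = 70692
# Lowest MentHlth values and their counts, copied from the tables above.
menthlth_low = [(0, 48091), (1, 2051), (2, 3267), (3, 1967), (4, 981), (5, 2519)]

q1 = quantile(menthlth_low, N_OBSERVATIONS, 0.25)
median = quantile(menthlth_low, N_OBSERVATIONS, 0.50)
q3 = quantile(menthlth_low, N_OBSERVATIONS, 0.75)
print(q1, median, q3, q3 - q1)
```

Only the low end of the distribution is needed here, because the first three values already cover more than 75% of the rows.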

PhysHlth
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.810417
Minimum: 0
Maximum: 30
Zeros: 39915
Zeros (%): 56.5%
Negative: 0
Negative (%): 0.0%
Memory size: 552.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 6
95-th percentile: 30
Maximum: 30
Range: 30
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 10.062261
Coefficient of variation (CV): 1.7317622
Kurtosis: 1.1797312
Mean: 5.810417
Median Absolute Deviation (MAD): 0
Skewness: 1.6573044
Sum: 410750
Variance: 101.24909
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value              Count   Frequency (%)
0                  39915   56.5%
30                  7953   11.3%
2                   4102   5.8%
1                   2853   4.0%
3                   2438   3.4%
5                   2332   3.3%
10                  1980   2.8%
15                  1913   2.7%
4                   1376   1.9%
7                   1326   1.9%
Other values (21)   4504   6.4%

Value   Count   Frequency (%)
0       39915   56.5%
1        2853   4.0%
2        4102   5.8%
3        2438   3.4%
4        1376   1.9%
5        2332   3.3%
6         447   0.6%
7        1326   1.9%
8         276   0.4%
9          55   0.1%

Value   Count   Frequency (%)
30       7953   11.3%
29         95   0.1%
28        211   0.3%
27         34   < 0.1%
26         26   < 0.1%
25        557   0.8%
24         24   < 0.1%
23         27   < 0.1%
22         31   < 0.1%
21        229   0.3%

DiffWalk
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
52826 
1.0
17866 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 52826
74.7%
1.0 17866
 
25.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 52826
74.7%
1.0 17866
 
25.3%

Most occurring characters

ValueCountFrequency (%)
0 123518
58.2%
. 70692
33.3%
1 17866
 
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 123518
87.4%
1 17866
 
12.6%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 123518
58.2%
. 70692
33.3%
1 17866
 
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 123518
58.2%
. 70692
33.3%
1 17866
 
8.4%

Stroke
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
66297 
1.0
 
4395

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 66297
93.8%
1.0 4395
 
6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 66297
93.8%
1.0 4395
 
6.2%

Most occurring characters

ValueCountFrequency (%)
0 136989
64.6%
. 70692
33.3%
1 4395
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 136989
96.9%
1 4395
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 136989
64.6%
. 70692
33.3%
1 4395
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 136989
64.6%
. 70692
33.3%
1 4395
 
2.1%

HighBP
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
1.0
39832 
0.0
30860 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row0.0
4th row1.0
5th row0.0

Common Values

ValueCountFrequency (%)
1.0 39832
56.3%
0.0 30860
43.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 39832
56.3%
0.0 30860
43.7%

Most occurring characters

ValueCountFrequency (%)
0 101552
47.9%
. 70692
33.3%
1 39832
 
18.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 101552
71.8%
1 39832
 
28.2%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 101552
47.9%
. 70692
33.3%
1 39832
 
18.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 101552
47.9%
. 70692
33.3%
1 39832
 
18.8%

Diabetes
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size552.4 KiB
0.0
35346 
1.0
35346 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters212076
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 35346
50.0%
1.0 35346
50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 35346
50.0%
1.0 35346
50.0%

Most occurring characters

ValueCountFrequency (%)
0 106038
50.0%
. 70692
33.3%
1 35346
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 141384
66.7%
Other Punctuation 70692
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 106038
75.0%
1 35346
 
25.0%
Other Punctuation
ValueCountFrequency (%)
. 70692
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 212076
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 106038
50.0%
. 70692
33.3%
1 35346
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 212076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 106038
50.0%
. 70692
33.3%
1 35346
 
16.7%

Interactions

(16 interaction plots; images not available in this text version)

Correlations

Variable, Age, BMI, MentHlth, PhysHlth, Sex, HighChol, CholCheck, Smoker, HeartDiseaseorAttack, PhysActivity, Fruits, Veggies, HvyAlcoholConsump, GenHlth, DiffWalk, Stroke, HighBP, Diabetes
Age, 1.000, -0.038, -0.168, 0.044, 0.034, 0.269, 0.107, 0.122, 0.224, 0.103, 0.082, 0.026, 0.058, 0.091, 0.198, 0.124, 0.348, 0.294
BMI, -0.038, 1.000, 0.087, 0.167, 0.124, 0.136, 0.047, 0.026, 0.061, 0.170, 0.082, 0.057, 0.054, 0.144, 0.250, 0.021, 0.244, 0.295
MentHlth, -0.168, 0.087, 1.000, 0.345, 0.106, 0.085, 0.012, 0.092, 0.076, 0.133, 0.065, 0.053, 0.023, 0.181, 0.256, 0.088, 0.067, 0.089
PhysHlth, 0.044, 0.167, 0.345, 1.000, 0.073, 0.150, 0.034, 0.121, 0.202, 0.235, 0.050, 0.066, 0.036, 0.328, 0.494, 0.164, 0.183, 0.221
Sex, 0.034, 0.124, 0.106, 0.073, 1.000, 0.017, 0.007, 0.112, 0.098, 0.052, 0.089, 0.052, 0.014, 0.032, 0.082, 0.000, 0.041, 0.044
HighChol, 0.269, 0.136, 0.085, 0.150, 0.017, 1.000, 0.086, 0.093, 0.181, 0.090, 0.047, 0.043, 0.025, 0.243, 0.162, 0.100, 0.316, 0.289
CholCheck, 0.107, 0.047, 0.012, 0.034, 0.007, 0.086, 1.000, 0.002, 0.043, 0.007, 0.017, 0.000, 0.027, 0.062, 0.044, 0.022, 0.103, 0.115
Smoker, 0.122, 0.026, 0.092, 0.121, 0.112, 0.093, 0.002, 1.000, 0.124, 0.080, 0.075, 0.030, 0.078, 0.153, 0.120, 0.064, 0.087, 0.086
HeartDiseaseorAttack, 0.224, 0.061, 0.076, 0.202, 0.098, 0.181, 0.043, 0.124, 1.000, 0.098, 0.019, 0.036, 0.037, 0.287, 0.233, 0.223, 0.211, 0.211
PhysActivity, 0.103, 0.170, 0.133, 0.235, 0.052, 0.090, 0.007, 0.080, 0.098, 1.000, 0.134, 0.149, 0.019, 0.277, 0.277, 0.080, 0.136, 0.159
Fruits, 0.082, 0.082, 0.065, 0.050, 0.089, 0.047, 0.017, 0.075, 0.019, 0.134, 1.000, 0.239, 0.033, 0.100, 0.051, 0.008, 0.041, 0.054
Veggies, 0.026, 0.057, 0.053, 0.066, 0.052, 0.043, 0.000, 0.030, 0.036, 0.149, 0.239, 1.000, 0.022, 0.117, 0.084, 0.047, 0.066, 0.079
HvyAlcoholConsump, 0.058, 0.054, 0.023, 0.036, 0.014, 0.025, 0.027, 0.078, 0.037, 0.019, 0.033, 0.022, 1.000, 0.059, 0.049, 0.023, 0.027, 0.095
GenHlth, 0.091, 0.144, 0.181, 0.328, 0.032, 0.243, 0.062, 0.153, 0.287, 0.277, 0.100, 0.117, 0.059, 1.000, 0.504, 0.204, 0.329, 0.417
DiffWalk, 0.198, 0.250, 0.256, 0.494, 0.082, 0.162, 0.044, 0.120, 0.233, 0.277, 0.051, 0.084, 0.049, 0.504, 1.000, 0.192, 0.235, 0.273
Stroke, 0.124, 0.021, 0.088, 0.164, 0.000, 0.100, 0.022, 0.064, 0.223, 0.080, 0.008, 0.047, 0.023, 0.204, 0.192, 1.000, 0.129, 0.125
HighBP, 0.348, 0.244, 0.067, 0.183, 0.041, 0.316, 0.103, 0.087, 0.211, 0.136, 0.041, 0.066, 0.027, 0.329, 0.235, 0.129, 1.000, 0.381
Diabetes, 0.294, 0.295, 0.089, 0.221, 0.044, 0.289, 0.115, 0.086, 0.211, 0.159, 0.054, 0.079, 0.095, 0.417, 0.273, 0.125, 0.381, 1.000
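Only one variable pair in this matrix reaches 0.5, which fits the two High correlation alerts for GenHlth and DiffWalk. The sketch below filters a few of the largest off-diagonal entries against a 0.5 cut-off. The threshold is an assumption, since the report does not state the value it used, and the names are illustrative.

```python
# Assumption: pairs at or above 0.5 are flagged; the report does not say so.
HIGH_CORRELATION_THRESHOLD = 0.5

# Some of the largest off-diagonal entries, copied from the matrix above.
pairs = {
    ("GenHlth", "DiffWalk"): 0.504,
    ("PhysHlth", "DiffWalk"): 0.494,
    ("GenHlth", "Diabetes"): 0.417,
    ("HighBP", "Diabetes"): 0.381,
    ("Age", "HighBP"): 0.348,
    ("MentHlth", "PhysHlth"): 0.345,
}

flagged = [
    pair for pair, coefficient in pairs.items()
    if abs(coefficient) >= HIGH_CORRELATION_THRESHOLD
]
print(flagged)
```

PhysHlth and DiffWalk (0.494) fall just short of the assumed threshold, so a slightly lower cut-off would flag that pair as well.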

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

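Index, Age, Sex, HighChol, CholCheck, BMI, Smoker, HeartDiseaseorAttack, PhysActivity, Fruits, Veggies, HvyAlcoholConsump, GenHlth, MentHlth, PhysHlth, DiffWalk, Stroke, HighBP, Diabetes
0, 4.0, 1.0, 0.0, 1.0, 26.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 3.0, 5.0, 30.0, 0.0, 0.0, 1.0, 0.0
1, 12.0, 1.0, 1.0, 1.0, 26.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0
2, 13.0, 1.0, 0.0, 1.0, 26.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0
3, 11.0, 1.0, 1.0, 1.0, 28.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 3.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0
4, 8.0, 0.0, 0.0, 1.0, 29.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
5, 1.0, 0.0, 0.0, 1.0, 18.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 2.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0
6, 13.0, 1.0, 1.0, 1.0, 26.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
7, 6.0, 1.0, 0.0, 1.0, 31.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
8, 3.0, 0.0, 0.0, 1.0, 32.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
9, 6.0, 1.0, 0.0, 1.0, 27.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 3.0, 0.0, 6.0, 0.0, 0.0, 0.0, 0.0

Index, Age, Sex, HighChol, CholCheck, BMI, Smoker, HeartDiseaseorAttack, PhysActivity, Fruits, Veggies, HvyAlcoholConsump, GenHlth, MentHlth, PhysHlth, DiffWalk, Stroke, HighBP, Diabetes
70682, 9.0, 0.0, 0.0, 1.0, 37.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 0.0, 30.0, 1.0, 0.0, 1.0, 1.0
70683, 10.0, 0.0, 0.0, 1.0, 28.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0
70684, 9.0, 1.0, 1.0, 1.0, 27.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 4.0, 30.0, 5.0, 0.0, 0.0, 1.0, 1.0
70685, 7.0, 0.0, 0.0, 1.0, 38.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0
70686, 11.0, 1.0, 1.0, 1.0, 27.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 4.0, 0.0, 30.0, 0.0, 0.0, 0.0, 1.0
70687, 6.0, 0.0, 1.0, 1.0, 37.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0
70688, 10.0, 1.0, 1.0, 1.0, 29.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0
70689, 13.0, 0.0, 1.0, 1.0, 25.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 5.0, 15.0, 0.0, 1.0, 0.0, 1.0, 1.0
70690, 11.0, 0.0, 1.0, 1.0, 18.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0
70691, 9.0, 0.0, 1.0, 1.0, 25.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0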

Duplicate rows

Most frequently occurring

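Index, Age, Sex, HighChol, CholCheck, BMI, Smoker, HeartDiseaseorAttack, PhysActivity, Fruits, Veggies, HvyAlcoholConsump, GenHlth, MentHlth, PhysHlth, DiffWalk, Stroke, HighBP, Diabetes, # duplicates
1670, 9.0, 0.0, 0.0, 1.0, 22.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 21
424, 5.0, 0.0, 0.0, 1.0, 21.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 17
603, 6.0, 0.0, 0.0, 1.0, 21.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 17
867, 7.0, 0.0, 0.0, 1.0, 23.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 17
613, 6.0, 0.0, 0.0, 1.0, 22.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 15
432, 5.0, 0.0, 0.0, 1.0, 22.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 14
376, 4.0, 1.0, 0.0, 1.0, 27.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12
621, 6.0, 0.0, 0.0, 1.0, 23.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12
849, 7.0, 0.0, 0.0, 1.0, 21.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12
856, 7.0, 0.0, 0.0, 1.0, 22.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12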